Data-Oriented Methods for Grapheme-to-Phoneme Conversion
Authors
Abstract
It is traditionally assumed that various sources of linguistic knowledge and their interaction should be formalised in order to be able to convert words into their phonemic representations with reasonable accuracy. We show that using supervised learning techniques, based on a corpus of transcribed words, the same and even better performance can be achieved, without explicit modeling of linguistic knowledge. In this paper we present two instances of this approach. A first model implements a variant of instance-based learning, in which a weighted similarity metric and a database of prototypical exemplars are used to predict new mappings. In the second model, grapheme-to-phoneme mappings are looked up in a compressed text-to-speech lexicon (table lookup) enriched with default mappings. We compare performance and accuracy of these approaches to a connectionist (backpropagation) approach and to the linguistic knowledge-based approach.
1 Introduction
Grapheme-to-phoneme conversion is a central task in any text-to-speech (reading aloud) system. Given an alphabet of spelling symbols (graphemes) and an alphabet of phonetic symbols, a mapping should be achieved transliterating strings of graphemes into strings of phonetic symbols. It is well known that this mapping is difficult because, in general, not all graphemes are realised in the phonetic transcription, and the same grapheme may correspond to different phonetic symbols, depending on context. It is traditionally assumed that various sources of linguistic knowledge and their interaction should be formalised in order to be able to convert words into their phonemic representations with reasonable accuracy. Although different researchers propose different knowledge structures, the consensus seems to be that at least morphological and phonotactic knowledge should be incorporated in order to be able to find morphological and syllable structure.
These structures are deemed necessary to define the proper domains for phonological and phonetic rules. As a typical architecture for grapheme-to-phoneme conversion in Dutch, consider the modules in [Daelemans, 1988] shown in Figure 1. It contains most of the traditional data structures and processing components proposed by computational linguists. A problem with this approach is that the knowledge needed is highly language-dependent and requires a significant amount of linguistic engineering. We argue that using data-oriented learning techniques on a corpus of transcribed words (information which is readily available in many machine-readable dictionaries), the same and even better performance can be achieved, without explicit modeling of linguistic knowledge. The advantages of such an approach are that the technique is reusable for different sets of data (e.g. different languages or sublanguages), and that it is automatic (no explicit linguistic engineering is needed to handcraft the rules and knowledge structures necessary for implementing the target mapping). In this paper we present two instances of this approach in the domain of grapheme-to-phoneme conversion. A first model implements a variant of instance-based learning, in which a similarity metric (weighted by using a metric based on information entropy) and a database of prototypical exemplars are
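The instance-based model described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the phoneme symbols, the one-letter context window, the `_` padding symbol, and the choice of a nearest-neighbour vote over an overlap metric are all assumptions made for illustration. Each letter of a word becomes an exemplar (a window of surrounding letters paired with a phoneme), each window position is weighted by its information gain, and a new window receives the phoneme of its most similar stored exemplars.

```python
import math
from collections import Counter, defaultdict

PAD = "_"  # hypothetical padding symbol marking word boundaries

def windows(word, left=1, right=1):
    """One fixed-width letter window per letter of the word."""
    padded = PAD * left + word + PAD * right
    return [tuple(padded[i:i + left + 1 + right]) for i in range(len(word))]

def build_instances(lexicon):
    """Turn letter-aligned (word, phonemes) pairs into (window, phoneme) exemplars."""
    instances = []
    for word, phonemes in lexicon:
        assert len(word) == len(phonemes), "sketch assumes one symbol per letter"
        instances.extend(zip(windows(word), phonemes))
    return instances

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def information_gain_weights(instances):
    """Weight of a window position = reduction in phoneme entropy from knowing it."""
    base = entropy([ph for _, ph in instances])
    weights = []
    for pos in range(len(instances[0][0])):
        by_value = defaultdict(list)
        for feats, ph in instances:
            by_value[feats[pos]].append(ph)
        remainder = sum(len(v) / len(instances) * entropy(v) for v in by_value.values())
        weights.append(base - remainder)
    return weights

def predict(window, instances, weights):
    """Phoneme voted by the exemplars with the highest weighted overlap."""
    best, votes = -1.0, Counter()
    for feats, ph in instances:
        sim = sum(w for w, a, b in zip(weights, feats, window) if a == b)
        if sim > best:
            best, votes = sim, Counter([ph])
        elif sim == best:
            votes[ph] += 1
    return votes.most_common(1)[0][0]

def transcribe(word, instances, weights):
    return [predict(w, instances, weights) for w in windows(word)]

# Toy, made-up lexicon; "-" marks a letter with no phonemic realisation.
TOY_LEXICON = [
    ("cell", ["s", "e", "l", "-"]),
    ("city", ["s", "i", "t", "i"]),
    ("cab", ["k", "a", "b"]),
    ("cut", ["k", "u", "t"]),
]
instances = build_instances(TOY_LEXICON)
weights = information_gain_weights(instances)
print(transcribe("cit", instances, weights))
```

On this toy data the focus letter receives the largest weight, so a mismatch on the letter being transcribed costs more than a mismatch on its neighbours. A realistic system would use a wider window and a far larger aligned lexicon.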
Similar resources
Language-independent Data-oriented Grapheme-to-phoneme Conversion
We describe an approach to grapheme-to-phoneme conversion which is both language-independent and data-oriented. Given a set of examples (spelling words with their associated phonetic representation) in a language, a grapheme-to-phoneme conversion system is automatically produced for that language which takes as its input the spelling of words, and produces as its output the phonetic transcripti...
Language-Independent Data-Oriented Grapheme
We describe an approach to grapheme-to-phoneme conversion which is both language-independent and data-oriented. Given a set of examples (spelling words with their associated phonetic representation) in a language, a grapheme-to-phoneme conversion system is automatically produced for that language which takes as its input the spelling of words, and produces as its output the phonetic transcription ...
A language-independent, data-oriented architecture for grapheme-to-phoneme conversion
We report on an implemented grapheme-to-phoneme conversion architecture. Given a set of examples (spelling words with their associated phonetic representation) in a language, a grapheme-to-phoneme conversion system is automatically produced for that language which takes as its input the spelling of words, and produces as its output the phonetic transcription according to the rules implicit in the ...
Conversion from phoneme based to grapheme based acoustic models for speech recognition
This paper focuses on acoustic modeling in speech recognition. A novel approach to building grapheme-based acoustic models by conversion from existing phoneme-based acoustic models is proposed. The grapheme-based acoustic models are created as a weighted sum of monophone acoustic models. The influence of a particular monophone is determined by the phoneme-to-grapheme confusion matrix. Furthe...
A Finite State and Data-Oriented Method for Grapheme to Phoneme Conversion
A finite-state method, based on leftmost longest-match replacement, is presented for segmenting words into graphemes, and for converting graphemes into phonemes. A small set of hand-crafted conversion rules for Dutch achieves a phoneme accuracy of over 93%. The accuracy of the system is further improved by using transformation-based learning. The phoneme accuracy of the best system (using a larg...
Rule-based Korean Grapheme to Phoneme Conversion Using Sound Patterns
Grapheme-to-phoneme conversion plays an important role in text-to-speech applications and other fields of computational linguistics. Although Korean uses a phonemic writing system, it requires grapheme-to-phoneme conversion for speech synthesis because the Korean writing system does not always reflect actual pronunciation. This paper describes a grapheme-to-phoneme conversion method based o...